Fast Robust Inverse Transform SAT and Multi-stage Adaptation

Authors

  • Hubert Jin
  • Spyros Matsoukas
  • Richard Schwartz
  • Francis Kubala
Abstract

We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in a lower error rate than previous methods. The method, called Inverse Transform SAT (ITSAT), is based on removing the differences between speakers before training, rather than modeling the differences during training. We develop several methods to avoid the problems associated with inverting the transformation. In one method, we interpolate the transformation matrix with an identity or diagonal transformation. We also apply constraints to the matrix to avoid estimation problems. We show that by using many diagonal-only transformation matrices with constraints we can achieve performance that is comparable to that of the original SAT method at a fraction of the cost. In addition, we describe a multi-stage approach to Maximum Likelihood Linear Regression (MLLR) unsupervised adaptation and we show that it is more effective than a single-stage regular MLLR adaptation. As a final stage, we adapt the resulting model at a finer resolution, using Maximum A Posteriori (MAP) adaptation. With the combination of all the above adaptation methods we obtain a 13.6% overall reduction in WER relative to Speaker Independent (SI) training and decoding.
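
The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names: interpolating a transformation matrix with the identity, and inverting a constrained diagonal-only transformation to map observed features back toward a speaker-neutral space. The function names, the interpolation weight, the clamping bounds, and the affine form o = a * x + b are all assumptions made for illustration and are not taken from the paper.

```python
def interpolate_with_identity(matrix, weight):
    """Blend a square matrix with the identity: (1 - weight) * A + weight * I.

    Assumption: a simple linear blend; the paper's exact scheme is not given.
    """
    n = len(matrix)
    return [
        [(1.0 - weight) * matrix[i][j] + (weight if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def clamp_diagonal(diag, low, high):
    """Constrain each diagonal scale to [low, high] (hypothetical bounds)
    so that the later division cannot blow up on a poorly estimated value."""
    return [min(max(d, low), high) for d in diag]


def inverse_diagonal_transform(obs, diag, bias):
    """Undo an assumed per-dimension affine map o = d * x + b."""
    return [(o - b) / d for o, d, b in zip(obs, diag, bias)]


# Usage: blend a full matrix halfway toward the identity, then
# invert a clamped diagonal transform on a two-dimensional feature vector.
blended = interpolate_with_identity([[2.0, 1.0], [0.0, 0.0]], 0.5)
scales = clamp_diagonal([0.01, 1.0, 9.0], 0.5, 2.0)
restored = inverse_diagonal_transform([3.0, 5.0], [2.0, 0.5], [1.0, 1.0])
```

In this sketch the blend pulls a singular matrix (second row all zeros) toward the identity so that it becomes invertible, and the diagonal case needs only a per-dimension division, which is one plausible reading of why diagonal-only transformations would be cheap.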


Similar resources

Fast robust inverse transform speaker adapted training using diagonal transformations

We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in lower error rate than the previous methods. The method, called Inverse Transform SAT (ITSAT) is based on removing the differences between speakers before training, rather than modeling the differences during training. We develop several methods to avoid the problems associated with inverting th...


Rapid unsupervised speaker adaptation robust in reverberant environment conditions

We expand the conventional rapid adaptation based on N-closest speakers sufficient statistics (suff stat) to achieve robustness under reverberant conditions. We integrated our fast dereverberation technique based on optimized multi-band spectral subtraction as pre-processing. This removes the late reflection components of the reverberant signal effectively and fast. Speakers’ suff stat are then ...


Is the Sharp Adaptation Transform more plausible than CMCCAT2000?

The modified Bradford chromatic adaptation transform (CMCCAT2000) is a von Kries type model of adaptation that best accounts for a variety of corresponding colour data sets. The transform works in three stages. First, XYZs are linearly mapped to a new ’RGB’ space. The RGB sensitivities are somewhat like the cones but have their sensitivity concentrated in narrower regions of the visible spectru...


Fast inverse transform sampling in one and two dimensions

We develop a computationally efficient and robust algorithm for generating pseudo-random samples from a broad class of smooth probability distributions in one and two dimensions. The algorithm is based on inverse transform sampling with a polynomial approximation scheme using Chebyshev polynomials, Chebyshev grids, and low rank function approximation. Numerical experiments demonstrate that our ...



Publication date: 1998